Real-Time A-Share Data Analysis with Python
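Disclaimer:
This article only collects publicly available data by lawful means; the data is not used or redistributed in violation of any rules.
Series:
This is Part 1 of the Python Projects column (Real-Time A-Share Data Analysis with Python).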
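Preface:
This chapter is written for readers who are new to A-share data analysis. It works through two examples: fetching A-share quotes from Eastmoney (东方财富网), and analyzing the stock of Enlight Media (光线传媒), the company behind Ne Zha 2 (哪吒2).
Use case: the same approach carries over to other sites that publish tabular data in a layout similar to Eastmoney's, such as government procurement portals.
For a deeper treatment, see
Python Projects column, Part 2 (Real-Time A-Share Data Analysis with Python, Advanced Edition)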
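I. Fetching A-share data from Eastmoney
1. Eastmoney home page --> 沪深京 (Shanghai/Shenzhen/Beijing) --> 沪深京个股 (individual stocks) --> 沪深京A股 (A-shares)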


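As of June 15, 2025, this list holds quotes for 5,723 listed stocks.
2. Open the developer tools --> Network --> All --> refresh the page --> search for a stock name shown in the table (here 科力股份) --> open the matching request --> copy the request URL from the Headers tab
Build the request headers and request that data URL: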
import requests
import re
url='https://push2.eastmoney.com/api/qt/clist/get?np=1&fltt=1&invt=2&cb=jQuery37109326831773859228_1749957681848&fs=m%3A0%2Bt%3A6%2Cm%3A0%2Bt%3A80%2Cm%3A1%2Bt%3A2%2Cm%3A1%2Bt%3A23%2Cm%3A0%2Bt%3A81%2Bs%3A2048&fields=f12%2Cf13%2Cf14%2Cf1%2Cf2%2Cf4%2Cf3%2Cf152%2Cf5%2Cf6%2Cf7%2Cf15%2Cf18%2Cf16%2Cf17%2Cf10%2Cf8%2Cf9%2Cf23&fid=f3&pn=1&pz=20&po=1&dect=1&ut=fa5fd1943c7b386f172d6893dbfba10b&wbp2u=%7C0%7C0%7C0%7Cweb&_=1749957681852'
headers={'User-Agent':'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/137.0.0.0 Safari/537.36 Edg/137.0.0.0',
'Referer':'https://quote.eastmoney.com/center/gridlist.html',
'Cookie':'qgqp_b_id=1f6f4d1fe5f6dc768b0c86464ff4ca12; websitepoptg_api_time=1749954789719; st_si=15357815110768; st_asi=delete; fullscreengg=1; fullscreengg2=1; st_pvi=60193717251270; st_sp=2025-06-15%2010%3A33%3A09; st_inirUrl=https%3A%2F%2Fcn.bing.com%2F; st_sn=7; st_psi=2025061511212222-113200301321-0920417507'}
res=requests.get(url,headers=headers)
print(res.text)
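The copied URL is hard to read because its query string is percent-encoded. As a minimal sketch, the same query can be rebuilt from a dictionary with the standard library, which makes the individual parameters visible. The helper name `build_list_url` is hypothetical, and the meanings given in the comments (pn as page number, pz as page size) are inferred from the URL itself, not taken from any official documentation.

```python
from urllib.parse import urlencode

BASE_URL = 'https://push2.eastmoney.com/api/qt/clist/get'

def build_list_url(page, page_size=20):
    """Rebuild the copied list URL for one page (hypothetical helper)."""
    params = {
        'np': 1, 'fltt': 1, 'invt': 2,
        'cb': 'jQuery37109326831773859228_1749957681848',
        # Decoded form of the fs=... filter in the copied URL
        'fs': 'm:0+t:6,m:0+t:80,m:1+t:2,m:1+t:23,m:0+t:81+s:2048',
        # Decoded form of the fields=... list
        'fields': 'f12,f13,f14,f1,f2,f4,f3,f152,f5,f6,f7,f15,f18,f16,f17,f10,f8,f9,f23',
        'fid': 'f3',
        'pn': page,        # page number (inferred)
        'pz': page_size,   # rows per page (inferred)
        'po': 1, 'dect': 1,
        'ut': 'fa5fd1943c7b386f172d6893dbfba10b',
        'wbp2u': '|0|0|0|web',
        '_': 1749957681852,
    }
    # urlencode percent-encodes ':', '+', ',' and '|' the same way the browser did
    return BASE_URL + '?' + urlencode(params)

print(build_list_url(1))
```

Only `page` changes between requests, so a loop over pages needs nothing more than `build_list_url(pn)`.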

3. From the response text, use re regular expressions to extract the name, code, price and the other fields, then print them in a for loop
# Each pattern captures the value between one field key and the key that follows it,
# so it depends on the order in which the response lists the fields.
name = re.findall('"f14":"(.*?)","f15"', res.text)
code = re.findall('"f12":"(.*?)","f13"', res.text)
new_price = re.findall('"f2":(.*?),"f3"', res.text)
open_price = re.findall('"f17":(.*?),"f18"', res.text)  # today's open; renamed so the built-in open() is not shadowed
prev_close = re.findall('"f18":(.*?),"f23"', res.text)  # f18 is the previous close
high = re.findall('"f15":(.*?),"f16"', res.text)
low = re.findall('"f16":(.*?),"f17"', res.text)
volume = re.findall('"f5":(.*?),"f6"', res.text)
amount = re.findall('"f6":(.*?),"f7"', res.text)
amplitude = re.findall('"f7":(.*?),"f8"', res.text)
price_limit = re.findall('"f3":(.*?),"f4"', res.text)  # change in percent
price_limit_amount = re.findall('"f4":(.*?),"f5"', res.text)  # change in price
turnover_rate = re.findall('"f8":(.*?),"f9"', res.text)
for i in range(len(name)):
    print(name[i], code[i], new_price[i], open_price[i], prev_close[i], high[i], low[i], volume[i], amount[i], amplitude[i], price_limit[i], price_limit_amount[i], turnover_rate[i])
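The regular expressions above rely on the order of the keys in the response. A less fragile alternative, sketched here, is to strip the jQuery callback wrapper and decode the payload as JSON. The sample text is made up, and its layout (a `data` object holding a `diff` list of per-stock dictionaries) is an assumption about the response; check it against the real `res.text` before relying on it.

```python
import json
import re

# Made-up sample shaped like the response: a jQuery callback wrapped around JSON.
sample = ('jQuery37109326831773859228_1749957681848('
          '{"data":{"total":2,"diff":['
          '{"f2":4114,"f12":"X00001","f14":"Stock A","f17":4000,"f18":3900},'
          '{"f2":"-","f12":"X00002","f14":"Stock B","f17":"-","f18":1234}'
          ']}});')

def parse_jsonp(text):
    """Strip the callback wrapper and decode the JSON payload."""
    # Greedy match: from the first "(" to the last ")" in the text
    match = re.search(r'\((.*)\)', text, re.S)
    if match is None:
        raise ValueError('no JSONP payload found')
    return json.loads(match.group(1))

rows = parse_jsonp(sample)['data']['diff']
for row in rows:
    # Fields are looked up by key, so their order no longer matters
    print(row['f14'], row['f12'], row['f2'])
```

With the real response, `parse_jsonp(res.text)` would replace the thirteen `re.findall` calls, and each `row` carries every requested field.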

4. Fetch the listed-stock data from every page and save it to A股原始底层数据获取.csv
url='https://push2.eastmoney.com/api/qt/clist/get?np=1&fltt=1&invt=2&cb=jQuery37109326831773859228_1749957681848&fs=m%3A0%2Bt%3A6%2Cm%3A0%2Bt%3A80%2Cm%3A1%2Bt%3A2%2Cm%3A1%2Bt%3A23%2Cm%3A0%2Bt%3A81%2Bs%3A2048&fields=f12%2Cf13%2Cf14%2Cf1%2Cf2%2Cf4%2Cf3%2Cf152%2Cf5%2Cf6%2Cf7%2Cf15%2Cf18%2Cf16%2Cf17%2Cf10%2Cf8%2Cf9%2Cf23&fid=f3&pn=1&pz=20&po=1&dect=1&ut=fa5fd1943c7b386f172d6893dbfba10b&wbp2u=%7C0%7C0%7C0%7Cweb&_=1749957681852'
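The URL shows that pn=1 returns the first page, pn=2 the second and pn=3 the third. With 20 rows per page (pz=20), the 5,723 stocks span pages 1 to 287.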
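The page count follows directly from the two numbers already seen: the total shown on the site and the page size in the URL. A short check:

```python
import math

total_stocks = 5723  # count shown on the page on 2025-06-15
page_size = 20       # pz=20 in the URL

# Round up so the last, partially filled page is included
page_count = math.ceil(total_stocks / page_size)
print(page_count)

# range() excludes its end value, hence page_count + 1
pages = range(1, page_count + 1)
print(pages[0], pages[-1])
```

Because the number of listed stocks changes over time, computing the page count this way is safer than hard-coding it.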
import requests
import re
import pandas as pd

stock_list = []
for pn in range(1, 288):  # pages 1 to 287
    url = f'https://push2.eastmoney.com/api/qt/clist/get?np=1&fltt=1&invt=2&cb=jQuery37109326831773859228_1749957681848&fs=m%3A0%2Bt%3A6%2Cm%3A0%2Bt%3A80%2Cm%3A1%2Bt%3A2%2Cm%3A1%2Bt%3A23%2Cm%3A0%2Bt%3A81%2Bs%3A2048&fields=f12%2Cf13%2Cf14%2Cf1%2Cf2%2Cf4%2Cf3%2Cf152%2Cf5%2Cf6%2Cf7%2Cf15%2Cf18%2Cf16%2Cf17%2Cf10%2Cf8%2Cf9%2Cf23&fid=f3&pn={pn}&pz=20&po=1&dect=1&ut=fa5fd1943c7b386f172d6893dbfba10b&wbp2u=%7C0%7C0%7C0%7Cweb&_=1749957681852'
    headers = {
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/137.0.0.0 Safari/537.36 Edg/137.0.0.0',
        'Referer': 'https://quote.eastmoney.com/center/gridlist.html',
        'Cookie': 'qgqp_b_id=1f6f4d1fe5f6dc768b0c86464ff4ca12; websitepoptg_api_time=1749954789719; st_si=15357815110768; st_asi=delete; fullscreengg=1; fullscreengg2=1; st_pvi=60193717251270; st_sp=2025-06-15%2010%3A33%3A09; st_inirUrl=https%3A%2F%2Fcn.bing.com%2F; st_sn=7; st_psi=2025061511212222-113200301321-0920417507'
    }
    res = requests.get(url, headers=headers)
    # print(res.text)
    name = re.findall('"f14":"(.*?)","f15"', res.text)
    code = re.findall('"f12":"(.*?)","f13"', res.text)
    new_price = re.findall('"f2":(.*?),"f3"', res.text)
    open_price = re.findall('"f17":(.*?),"f18"', res.text)  # renamed so the built-in open() is not shadowed
    prev_close = re.findall('"f18":(.*?),"f23"', res.text)  # f18 is the previous close
    high = re.findall('"f15":(.*?),"f16"', res.text)
    low = re.findall('"f16":(.*?),"f17"', res.text)
    volume = re.findall('"f5":(.*?),"f6"', res.text)
    amount = re.findall('"f6":(.*?),"f7"', res.text)
    amplitude = re.findall('"f7":(.*?),"f8"', res.text)
    price_limit = re.findall('"f3":(.*?),"f4"', res.text)
    price_limit_amount = re.findall('"f4":(.*?),"f5"', res.text)
    turnover_rate = re.findall('"f8":(.*?),"f9"', res.text)
    # One row per stock on this page
    for i in range(len(name)):
        stock = [name[i], code[i], new_price[i], open_price[i], prev_close[i], high[i], low[i], volume[i], amount[i], amplitude[i], price_limit[i], price_limit_amount[i], turnover_rate[i]]
        print(stock)
        stock_list.append(stock)
# Build the table once, after all pages have been collected
data = pd.DataFrame(stock_list, columns=['公司名称','股票代码','最新价','开盘','昨收','最高','最低','成交量','成交额','振幅','涨跌幅','涨跌额','换手率'])
data.to_csv(r'E:\data_pachong\A股实时数据分析\A股原始底层数据获取.csv', index=False, encoding='gbk')

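5. Some of the scraped values do not match what the page displays. For example, the latest price of 科力股份 is 41.14, but the scraped value is 4114, i.e. the price multiplied by 100.
The raw data therefore needs cleaning. This can be done in Excel, for example =C2/100 for the latest price and =J2/100&"%" for the amplitude.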
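The same scaling rule can be written as a small lookup table in Python. This is only a sketch: the helper `scale_value` is hypothetical, every raw value except 4114 is made up, and the `"-"` placeholder is an assumption about how the interface may mark a missing value (for example for a stock that did not trade).

```python
import math

# Divisor per field: latest price (f2), change % (f3), change (f4), amplitude (f7)
SCALE = {'f2': 100, 'f3': 100, 'f4': 100, 'f7': 100}

def scale_value(raw, divisor):
    """Return raw / divisor, or NaN when raw is not numeric (e.g. "-")."""
    try:
        return float(raw) / divisor
    except (TypeError, ValueError):
        return math.nan

# Made-up raw row; only 4114 comes from the article
raw_row = {'f2': 4114, 'f3': 2999, 'f4': 949, 'f7': '-'}
clean_row = {key: scale_value(raw_row[key], div) for key, div in SCALE.items()}
print(clean_row)
```

Returning NaN instead of raising keeps one bad cell from aborting a run over several thousand rows, and pandas treats NaN as a missing value when the table is saved or plotted.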

6. Since this is a Python column, though, doing the cleaning in Python is recommended
import requests
import re
import pandas as pd

def to_float(raw):
    # A missing value may come back as "-", which float() rejects; map it to NaN instead
    try:
        return float(raw)
    except ValueError:
        return float('nan')

# Empty list that collects one row per stock
stock_list = []
for pn in range(1, 288):  # loop over pages 1 to 287
    url = f'https://push2.eastmoney.com/api/qt/clist/get?np=1&fltt=1&invt=2&cb=jQuery37109326831773859228_1749957681848&fs=m%3A0%2Bt%3A6%2Cm%3A0%2Bt%3A80%2Cm%3A1%2Bt%3A2%2Cm%3A1%2Bt%3A23%2Cm%3A0%2Bt%3A81%2Bs%3A2048&fields=f12%2Cf13%2Cf14%2Cf1%2Cf2%2Cf4%2Cf3%2Cf152%2Cf5%2Cf6%2Cf7%2Cf15%2Cf18%2Cf16%2Cf17%2Cf10%2Cf8%2Cf9%2Cf23&fid=f3&pn={pn}&pz=20&po=1&dect=1&ut=fa5fd1943c7b386f172d6893dbfba10b&wbp2u=%7C0%7C0%7C0%7Cweb&_=1749957681852'
    headers = {
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/137.0.0.0 Safari/537.36 Edg/137.0.0.0',
        'Referer': 'https://quote.eastmoney.com/center/gridlist.html',
        'Cookie': 'qgqp_b_id=1f6f4d1fe5f6dc768b0c86464ff4ca12; websitepoptg_api_time=1749954789719; st_si=15357815110768; st_asi=delete; fullscreengg=1; fullscreengg2=1; st_pvi=60193717251270; st_sp=2025-06-15%2010%3A33%3A09; st_inirUrl=https%3A%2F%2Fcn.bing.com%2F; st_sn=7; st_psi=2025061511212222-113200301321-0920417507'
    }
    # Send the request
    res = requests.get(url, headers=headers, timeout=10)
    res.raise_for_status()  # stop if the request failed
    # Extract each field with regular expressions
    name = re.findall('"f14":"(.*?)","f15"', res.text)
    code = re.findall('"f12":"(.*?)","f13"', res.text)
    new_price = re.findall('"f2":(.*?),"f3"', res.text)
    open_price = re.findall('"f17":(.*?),"f18"', res.text)  # today's open
    close_yesterday = re.findall('"f18":(.*?),"f23"', res.text)  # previous close
    high = re.findall('"f15":(.*?),"f16"', res.text)
    low = re.findall('"f16":(.*?),"f17"', res.text)
    volume = re.findall('"f5":(.*?),"f6"', res.text)
    amount = re.findall('"f6":(.*?),"f7"', res.text)
    amplitude = re.findall('"f7":(.*?),"f8"', res.text)
    price_limit = re.findall('"f3":(.*?),"f4"', res.text)  # change in percent
    price_limit_amount = re.findall('"f4":(.*?),"f5"', res.text)  # change in price
    turnover_rate = re.findall('"f8":(.*?),"f9"', res.text)  # turnover rate
    # Process each stock on this page
    for i in range(len(name)):
        # Prices: divide by 100 to get yuan
        latest_price = to_float(new_price[i]) / 100
        open_val = to_float(open_price[i]) / 100
        close_yes = to_float(close_yesterday[i]) / 100
        high_val = to_float(high[i]) / 100
        low_val = to_float(low[i]) / 100
        # Volume: convert lots to units of 10,000 lots (万手)
        volume_hand = to_float(volume[i]) / 10000
        # Traded value: convert yuan to units of 100 million yuan (亿元)
        amount_yuan = to_float(amount[i]) / 100000000
        # Change %, amplitude and turnover rate: divide by 100 to get percentages
        change_percent = to_float(price_limit[i]) / 100
        amplitude_val = to_float(amplitude[i]) / 100
        turnover_val = to_float(turnover_rate[i]) / 100
        # Price change: divide by 100 to get yuan
        change_amount = to_float(price_limit_amount[i]) / 100
        # Build the row, rounding every number to 2 decimal places
        stock = [
            name[i],
            code[i],
            round(latest_price, 2),
            round(open_val, 2),
            round(close_yes, 2),
            round(high_val, 2),
            round(low_val, 2),
            round(volume_hand, 2),
            round(amount_yuan, 2),
            round(amplitude_val, 2),
            round(change_percent, 2),
            round(change_amount, 2),
            round(turnover_val, 2)
        ]
        stock_list.append(stock)
        print(f"Processed: {name[i]}-{code[i]}")
# Build the DataFrame (column names carry the units)
columns = [
    '名称', '代码', '最新价(元)', '今开(元)', '昨收(元)',
    '最高(元)', '最低(元)', '成交量(万手)', '成交额(亿元)',
    '振幅(%)', '涨跌幅(%)', '涨跌额(元)', '换手率(%)'
]
data = pd.DataFrame(stock_list, columns=columns)
# Save as a CSV file
save_path = r'E:\data_pachong\A股实时数据分析\A股数据获取.csv'
data.to_csv(save_path, index=False, encoding='gbk')
print(f"Data saved to: {save_path}")
print(f"Processed {len(data)} stock records in total")

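7. Alternatively, the akshare package fetches a real-time A-share quote table in a single call. Compared with the requests plus regular expressions approach above, it is far shorter and the returned DataFrame is already cleaned, so no extra scaling step is needed. Note that akshare pulls from public web sources, not from a database: stock_zh_a_spot, used below, reads from Sina Finance, while stock_zh_a_spot_em is the Eastmoney counterpart.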
import akshare as ak
data=ak.stock_zh_a_spot()
data.to_csv(r'E:\data_pachong\A股实时数据分析\A股实时数据获取akshare.csv',index=False,encoding='gbk')

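II. Analyzing Enlight Media's stock data
1. Start by fetching Enlight Media's historical prices (Ne Zha 2 became a hit and the share price moved sharply, which makes the patterns easier to see)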
import akshare as ak
data=ak.stock_zh_a_hist(symbol="300251",period="daily",start_date="20180101",end_date="20250531",adjust="qfq")
data.to_csv(r'E:\data_pachong\A股实时数据分析\光线传媒[300251]历史股价.csv',index=False,encoding='gbk')

2. Read 光线传媒[300251]历史股价.csv and use the date column as the index
import pandas as pd
df=pd.read_csv(r'E:\data_pachong\A股实时数据分析\光线传媒[300251]历史股价.csv',index_col=0,encoding='gbk')
print(df)
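With `index_col=0` alone, the dates are kept as plain strings. As an optional refinement, `parse_dates=True` asks pandas to parse the index into real dates, which enables date-based slicing. The sketch below uses two made-up rows in place of the real CSV.

```python
import io
import pandas as pd

# Two made-up rows using column names like those in the saved CSV
csv_text = "日期,开盘,收盘\n2025-01-02,9.50,9.80\n2025-01-03,9.80,10.10\n"

# index_col=0 makes the first column the index; parse_dates=True parses it as dates
df_demo = pd.read_csv(io.StringIO(csv_text), index_col=0, parse_dates=True)
print(type(df_demo.index).__name__)

# A DatetimeIndex supports partial-string selection, e.g. every row in January 2025
january = df_demo.loc['2025-01']
print(january)
```

For the real file, the same two arguments would be passed together with the path and `encoding='gbk'`.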
3. Taking the closing price as an example, compute and plot its 5-day, 10-day, 30-day and 60-day moving averages
import pandas as pd
import matplotlib
matplotlib.use('TkAgg')  # use the Tkinter backend, a common desktop backend
import matplotlib.pyplot as plt
plt.rcParams['font.family'] = ['SimHei']  # font with Chinese glyphs, needed for the labels
plt.figure(figsize=(12,8))
df=pd.read_csv(r'E:\data_pachong\A股实时数据分析\光线传媒[300251]历史股价.csv',index_col=0,encoding='gbk')
# Plot the 5-day, 10-day, 30-day and 60-day moving averages of the close
df['收盘'].rolling(window=5).mean().plot(label='5日均值')
df['收盘'].rolling(window=10).mean().plot(label='10日均值')
df['收盘'].rolling(window=30).mean().plot(label='30日均值')
df['收盘'].rolling(window=60).mean().plot(label='60日均值')
plt.legend(loc='best')
plt.show()
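To see what `rolling(window=n).mean()` computes, it helps to run it on a few numbers. The closing prices below are made up, and a window of 3 is used only to keep the output short.

```python
import pandas as pd

close = pd.Series([10.0, 11.0, 12.0, 13.0, 14.0, 15.0])  # made-up closing prices

# Each value is the mean of the current row and the two rows before it
ma3 = close.rolling(window=3).mean()
print(ma3.tolist())

# The first window - 1 rows have too few observations, so they are NaN
print(ma3.isna().sum())
```

The same effect explains why the 60-day line in the chart starts 59 trading days later than the 5-day line starts.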

4. If we take only the open, close, high and low columns, they are all on the same scale, so they can be plotted directly
import pandas as pd
import matplotlib
matplotlib.use('TkAgg')  # use the Tkinter backend, a common desktop backend
import matplotlib.pyplot as plt
plt.rcParams['font.family'] = ['SimHei']  # font with Chinese glyphs, needed for the labels
df=pd.read_csv(r'E:\data_pachong\A股实时数据分析\光线传媒[300251]历史股价.csv',index_col=0,encoding='gbk')
df=df[['开盘','收盘','最高','最低']]
df.plot(figsize=(12,8))
plt.show()

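5. If we take open, close, high, low and traded value (成交额), the traded value is on a very different scale from the other four columns, so the data must be normalized before plotting
import pandas as pd
import matplotlib
matplotlib.use('TkAgg')  # use the Tkinter backend, a common desktop backend
import matplotlib.pyplot as plt
plt.rcParams['font.family'] = ['SimHei']  # font with Chinese glyphs, needed for the labels
df=pd.read_csv(r'E:\data_pachong\A股实时数据分析\光线传媒[300251]历史股价.csv',index_col=0,encoding='gbk')
# Traded value is orders of magnitude larger than the four price columns
df=df[['开盘','收盘','最高','最低','成交额']]
# Min-max normalization: rescale every column to the range [0, 1]
df_max_min=(df-df.min())/(df.max()-df.min())
df_max_min.plot(figsize=(12,8))
plt.show()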
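The normalization line applies min-max scaling column by column: each value x becomes (x - min) / (max - min), so every column ends up between 0 and 1. A small check on made-up numbers:

```python
import pandas as pd

# Made-up values: prices in yuan, traded value in yuan
demo = pd.DataFrame({
    '收盘': [10.0, 12.0, 14.0],
    '成交额': [1.0e8, 3.0e8, 2.0e8],
})

# min() and max() are computed per column and broadcast across the rows
scaled = (demo - demo.min()) / (demo.max() - demo.min())
print(scaled)
```

After scaling, both columns share one axis even though the raw values differ by seven orders of magnitude. A column whose values are all identical would divide by zero and come out as NaN, which is worth keeping in mind for short date ranges.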

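For example, in early 2025 both the traded value and the share price are high, which points to heavy buying. This matches what actually happened: Ne Zha 2 became a hit in early 2025 and Enlight Media's share price surged.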